1.  Introduction

In this note we consider a nonparametric approach to estimating an unknown
probability distribution function, or equivalently, a reliability function.
That is, nothing is assumed to be known about the specific form or parameters
of the distribution. Specifically, nonparametric empirical Bayes estimation
will be considered in that a prior distribution over the space of all probability
distributions is assumed to exist but is not completely specified. Korwar and
Hollander (1976, 1977) have taken such an approach based on the nonparametric
Bayes estimation of a distribution function given by Ferguson (1973, 1974).

We will present two additional nonparametric empirical Bayes estimators of a
distribution function, examine their properties, and compare them with the
Korwar-Hollander estimators. These estimators appear to be plausible alternatives
to the Korwar-Hollander estimators.

Let $(P_i, X_i)$, $i = 1, 2, \ldots$, be a sequence of independent random elements,
where the $P_i$ are random probability measures on the real line and, given
$P_i = P$, $X_i = (X_{i1}, \ldots, X_{im_i})$ is a random sample from $P$. Let $F_i$ denote the
corresponding random distribution function for each $P_i$, $i = 1, 2, \ldots$. The $P_i$
are taken to have a common prior distribution given by a Dirichlet process on
the measurable space $(R, B)$, where $R$ denotes the real line and $B$ is the $\sigma$-field
of Borel subsets of $R$. The parameter of the Dirichlet process will be denoted
by $\alpha(\cdot)$, a $\sigma$-additive finite nonnull measure on $(R, B)$. (See Ferguson's
(1973, 1974) papers for basic definitions and properties of Dirichlet processes.)

We consider the problem of estimating the distribution function
$F_{n+1}(t) = P_{n+1}((-\infty, t])$ in this empirical Bayes framework with respect to the
loss function $L(F, \hat F) = \int_R (F(t) - \hat F(t))^2 \, dW(t)$, where $W(t)$ is a specified
nonrandom weight function and $\hat F$ is an estimator of $F$. Korwar et al (1976,
1977) proposed the sequence of estimators

(1.1)  $G_{n+1}(t) = p_{m_{n+1}} \sum_{i=1}^{n} \hat F_i(t)/n + (1 - p_{m_{n+1}}) \hat F_{n+1}(t)$,  $n = 1, 2, \ldots$,

where $p_{m_n} = \alpha(R)/[\alpha(R) + m_n]$. Exact risk expressions were obtained and the
rate at which the overall expected loss for $G_{n+1}$ converged to the minimum Bayes
risk (attained by Ferguson's (1973) nonparametric Bayes estimators) was indicated.
Here  two  other  sequences  of  estimators  are  proposed  and  their  asymptotic 
optimality  and  comparison  with  (1.1)  are  considered. 


2.  The Estimators and Their Asymptotic Optimality


Let $M = \{M_{n+1}\}$ represent a sequence of estimators of an unknown distribution
function $F$. In our empirical Bayes framework, Ferguson's (1973, p. 222) Bayes
estimator of $F$ based on the $(n+1)$st stage sample is given by

(2.1)  $F_{m_{n+1}}(t) = p_{m_{n+1}} F_0(t) + (1 - p_{m_{n+1}}) \hat F_{n+1}(t)$,

where $F_0(t) = \alpha((-\infty, t])/\alpha(R)$ and $\hat F_{n+1}$ is the sample distribution function of
$X_{n+1}$. Then the Bayes risk $R_{n+1}(\alpha)$ of (2.1) is given by

(2.2)  $R_{n+1}(\alpha) = E_{X_{n+1}} \{ \int [ E_{F(t) | X_{n+1}} (F(t) - F_{m_{n+1}}(t))^2 ] \, dW(t) \}$,

and the risk of $M_{n+1}$ is

$\tilde R_{n+1}(M, \alpha) = E_{X_{n+1}} \{ \int [ E_{F(t) | X_{n+1}} (F(t) - M_{n+1}(t))^2 ] \, dW(t) \}$.

Denote the expectation of $\tilde R_{n+1}(M, \alpha)$ with respect to $X_1, \ldots, X_n$ by $R_{n+1}(M, \alpha)$.
Definition 2.1.  The sequence $M = \{M_{n+1}\}$ is said to be asymptotically
optimal relative to $\alpha$ if $R_{n+1}(M, \alpha)/R_{n+1}(\alpha) \to 1$ as $n \to \infty$.


We note that when the sample sizes at each stage $n$ are equal, then
Definition 2.1 reduces to that of Korwar et al (1976, Definition 2.3). In
this case, $R_{n+1}(\alpha) = R(\alpha)$, the minimum Bayes risk for Ferguson's estimator.
For completeness we state Lemma 2.5 of Korwar et al (1976).


Lemma 2.1.  Let $P$ be a Dirichlet process on $(R, B)$ with parameter $\alpha$,
and let $X_1, \ldots, X_m$ be a sample of size $m$ from $P$ with distribution function
$F(t) = P((-\infty, t])$. Let $\hat F(t)$ be the sample distribution function of
$X = (X_1, \ldots, X_m)$. Then for each $t \in R$,

$E(F(t) \mid X) = F_m(t)$,

$E(\hat F(t)) = F_0(t)$,

and

$E(\hat F^2(t)) = F_0(t)/m + (m-1) F_0(t) \{F_0(t)\alpha(R) + 1\} / \{m(\alpha(R) + 1)\}$,

where

$F_m(t) = p_m F_0(t) + (1 - p_m) \hat F(t)$ and $p_m = \alpha(R)/[\alpha(R) + m]$.
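The posterior mean in Lemma 2.1 is a weighted average of the prior guess $F_0$ and the sample distribution function. The following is a minimal sketch of that computation at a single point $t$, not code from the report. The function names and the particular `F0` (a unit exponential distribution function) are assumptions chosen only for illustration.

```python
import math
import numpy as np

def ecdf(sample, t):
    """Sample distribution function: fraction of observations <= t."""
    return float(np.mean(np.asarray(sample, dtype=float) <= t))

def bayes_estimate(sample, t, alpha_R, F0):
    """p_m * F0(t) + (1 - p_m) * ecdf(t), with p_m = alpha_R / (alpha_R + m)."""
    m = len(sample)
    p_m = alpha_R / (alpha_R + m)
    return p_m * F0(t) + (1.0 - p_m) * ecdf(sample, t)

def F0(t):
    """Hypothetical prior guess: unit exponential distribution function."""
    return 1.0 - math.exp(-t) if t > 0 else 0.0

# Illustrative data only; alpha_R plays the role of alpha(R).
sample = [0.2, 0.9, 1.5, 3.1]
estimate = bayes_estimate(sample, t=1.0, alpha_R=2.0, F0=F0)
```

A larger `alpha_R` pulls the estimate toward `F0(t)`, while a larger sample size pulls it toward the sample distribution function.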


Korwar et al (1977) proved the following theorem.

Theorem 2.1.  Let $\alpha(R)$ be known. Then the sequence $G = \{G_{n+1}\}$ defined by
(1.1) is asymptotically optimal relative to $\alpha$.

We now introduce two other sequences of estimators which seem to be
natural candidates for empirical Bayes estimation. We discuss their asymptotic
risk behavior and in Section 3 consider some of their small sample properties
and their behavior during early stages of the empirical Bayes estimation as
compared with the sequence (1.1).

If the sample sizes at the various stages are equal, $m_n = m$, $n = 1, 2, \ldots$,
the estimator $G_{n+1}$ puts equal weights on each of the previous $n$ sample
distribution functions. In some situations, it might be desirable to place more
weight on samples which occur at the most recent stages than those which are
observed at the beginning of the process. A sequence of estimators which is
appealing in this sense is defined by

(2.3)  $G^*_{n+1}(t) = p_{m_{n+1}} G^*_n(t) + (1 - p_{m_{n+1}}) \hat F_{n+1}(t)$,  $n = 1, 2, \ldots$,

where $G^*_1(t) = \hat F_1(t)$. The next theorem shows that $G^* = \{G^*_{n+1}\}$ is not exactly
asymptotically optimal relative to $\alpha$, but can be made $\varepsilon$-asymptotically optimal
as discussed after the proof.

Theorem 2.2.  As $n \to \infty$, $R_{n+1}(G^*, \alpha)$ converges to $[1 + \alpha(R)/(2\alpha(R) + m)] R(\alpha)$.

Proof.  First, we write $G^*_{n+1}(t)$ as

$G^*_{n+1}(t) = p_m^n \hat F_1(t) + p_m^{n-1}(1 - p_m) \hat F_2(t) + \cdots + p_m(1 - p_m) \hat F_n(t) + (1 - p_m) \hat F_{n+1}(t)$.

Now, similar to Equation (2.12) of Korwar et al (1976), it can be shown that

$R_{n+1}(G^*, \alpha) = R(\alpha) + \int E_{X_1, \ldots, X_{n+1}} (F_m(t) - G^*_{n+1}(t))^2 \, dW(t)$.

After some straightforward algebra and applying Lemma 2.1, it is easy to show
that as $n \to \infty$

(2.4)  $R_{n+1}(G^*, \alpha) \to R(\alpha) + [\alpha^2(R) / ((\alpha(R) + m)(2\alpha(R) + m)(\alpha(R) + 1))] \int F_0(t)(1 - F_0(t)) \, dW(t)$.

However, according to Equation (2.19) of Korwar et al (1976),

$R(\alpha) = [\alpha(R) / ((\alpha(R) + 1)(\alpha(R) + m))] \int F_0(t)(1 - F_0(t)) \, dW(t)$.

Thus, after simplification, (2.4) becomes

$R_{n+1}(G^*, \alpha) \to (1 + \alpha(R)/(2\alpha(R) + m)) R(\alpha)$.  ///
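The "straightforward algebra" behind (2.4) can be sketched as follows. This is one possible route, assuming equal sample sizes $m$, independence of the stage-wise sample distribution functions, and the recursive form of $G^*$ read from (2.3); the shorthand $a = \alpha(R)$, $p = p_m$ and the symbol $w_i$ are introduced here for illustration only.

```latex
% Shorthand (assumed): a = \alpha(R), p = p_m = a/(a+m).
% Step 1: the Bayes estimator and G^* share the term (1-p)\hat F_{n+1}(t), so
F_m(t) - G^*_{n+1}(t) = p\,\bigl(F_0(t) - G^*_n(t)\bigr).

% Step 2: G^*_n = \sum_{i=1}^{n} w_i \hat F_i with w_1 = p^{\,n-1},
% w_i = p^{\,n-i}(1-p) for i \ge 2, and \sum_i w_i = 1, hence E\,G^*_n(t) = F_0(t) and
E\bigl(F_0(t) - G^*_n(t)\bigr)^2
  = \operatorname{Var}\bigl(\hat F(t)\bigr) \sum_{i=1}^{n} w_i^2
  \;\longrightarrow\; \operatorname{Var}\bigl(\hat F(t)\bigr)\,\frac{1-p}{1+p}
  \quad (n \to \infty).

% Step 3: from the moments in Lemma 2.1,
\operatorname{Var}\bigl(\hat F(t)\bigr)
  = E\bigl(\hat F^2(t)\bigr) - F_0^2(t)
  = \frac{(a+m)\,F_0(t)\bigl(1 - F_0(t)\bigr)}{m\,(a+1)}.

% Step 4: with (1-p)/(1+p) = m/(2a+m),
p^2 \cdot \frac{m}{2a+m} \cdot \frac{(a+m)\,F_0(1-F_0)}{m\,(a+1)}
  = \frac{a^2}{(a+m)(2a+m)(a+1)}\,F_0(t)\bigl(1 - F_0(t)\bigr),

% which, integrated against dW(t), is the second term of (2.4).
% Dividing by R(\alpha) = \frac{a}{(a+1)(a+m)} \int F_0(1-F_0)\,dW gives the factor a/(2a+m).
```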

Note that if we increase the sample size $m$, the difference between
$\lim_{n \to \infty} R_{n+1}(G^*, \alpha)$ and $R(\alpha)$ will become smaller, and we can call $\{G^*_{n+1}\}$
$\varepsilon$-asymptotically optimal relative to $\alpha$ in this case, since for any $\varepsilon > 0$
we can choose $m$ so that $\lim_{n \to \infty} R_{n+1}(G^*, \alpha)$ is within $\varepsilon$ of $R(\alpha)$.
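The choice of $m$ for a given tolerance can be illustrated numerically. The sketch below is not from the report. It evaluates the second term of (2.4) and searches for the smallest $m$ that brings it below a tolerance `eps`. The constant `c` stands for the integral of $F_0(t)(1 - F_0(t))$ with respect to $W$, which must be supplied. The value 1/6 used in the example corresponds to the assumed case of $F_0$ uniform on (0, 1) with $W$ equal to Lebesgue measure there.

```python
def bayes_risk(alpha_R, m, c):
    """R(alpha) = alpha_R * c / ((alpha_R + 1)(alpha_R + m))."""
    return alpha_R * c / ((alpha_R + 1.0) * (alpha_R + m))

def limiting_excess(alpha_R, m, c):
    """Second term of (2.4): limiting gap between the risk of G* and R(alpha)."""
    return alpha_R ** 2 * c / ((alpha_R + m) * (2.0 * alpha_R + m) * (alpha_R + 1.0))

def smallest_m(alpha_R, c, eps):
    """Smallest per-stage sample size m with limiting excess risk <= eps."""
    m = 1
    while limiting_excess(alpha_R, m, c) > eps:
        m += 1
    return m

# Illustrative values only (assumptions, not taken from the report).
m_needed = smallest_m(alpha_R=5.0, c=1.0 / 6.0, eps=1e-3)
```

The ratio `limiting_excess / bayes_risk` equals `alpha_R / (2 * alpha_R + m)`, the relative excess stated in Theorem 2.2.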

The second sequence of estimators which we consider is defined by

(2.5)  $H_{n+1}(t) = p_{m_{n+1}} \hat S_n(t) + (1 - p_{m_{n+1}}) \hat F_{n+1}(t)$,  $n = 1, 2, \ldots$,

where $\hat S_n$ is the sample distribution function of the pooled observations
$X_1, \ldots, X_n$. Note that $H_{n+1}(t)$ is exactly the same as $G_{n+1}(t)$ when $m_n = m$
for each $n$. However, the asymptotic optimality of $\{H_{n+1}\}$ for the case that
the sample sizes are not constant requires a restriction on the sample sizes
at each step, as the next theorem shows. This condition results from the fact
that the pooled sample from which $\hat S_n$ is obtained is of size $K_n = \sum_{i=1}^{n} m_i$.
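To make the three weighting schemes concrete, the following sketch evaluates the estimators (1.1), (2.3) and (2.5) at a single point $t$ from a list of stage samples. It is an illustration under the reading of those formulas given in this section, not the authors' code. The function names are hypothetical and the data are made up.

```python
import numpy as np

def ecdf(sample, t):
    """Sample distribution function: fraction of observations <= t."""
    return float(np.mean(np.asarray(sample, dtype=float) <= t))

def estimators_at(samples, t, alpha_R):
    """Return (G, G_star, H) for the last stage; needs at least two stages."""
    *past, current = samples
    p = alpha_R / (alpha_R + len(current))
    F_now = ecdf(current, t)
    # (1.1): equal weight on each earlier sample distribution function.
    G = p * np.mean([ecdf(s, t) for s in past]) + (1.0 - p) * F_now
    # (2.3): recursive update, so recent stages receive more weight.
    G_star = ecdf(samples[0], t)
    for s in samples[1:]:
        p_s = alpha_R / (alpha_R + len(s))
        G_star = p_s * G_star + (1.0 - p_s) * ecdf(s, t)
    # (2.5): sample distribution function of the pooled earlier observations.
    pooled = np.concatenate([np.asarray(s, dtype=float) for s in past])
    H = p * ecdf(pooled, t) + (1.0 - p) * F_now
    return float(G), float(G_star), float(H)

# Three stages of equal size; illustrative values only.
samples = [[0.5, 1.2, 2.0], [0.1, 0.8, 3.0], [0.4, 1.1, 1.6]]
G, G_star, H = estimators_at(samples, t=1.0, alpha_R=1.0)
```

With equal stage sizes the pooled sample distribution function is the plain average of the stage-wise ones, so `G` and `H` coincide, while `G_star` differs because of its geometric weights.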

Theorem 2.3.  For unequal sample sizes, the sequence of estimators
$H = \{H_{n+1}\}$ is asymptotically optimal relative to $\alpha$ if and only if $m_n \to \infty$
as $n \to \infty$.
Proof.  Let $K_n = \sum_{i=1}^{n} m_i$. Similar to the proof of Theorem 2.2, we have

(2.6)  $R_{n+1}(H, \alpha) = R_{n+1}(\alpha) + \int E_{X_1, \ldots, X_{n+1}} (F_{m_{n+1}}(t) - H_{n+1}(t))^2 \, dW(t)$,

where

(2.7)  $E_{X_1, \ldots, X_{n+1}} (F_{m_{n+1}}(t) - H_{n+1}(t))^2 = p^2_{m_{n+1}} \{F_0^2(t) - 2F_0(t) E[\hat S_n(t)] + E[\hat S_n^2(t)]\}$.

Applying Lemma 2.1 to the expectations on the right side of (2.7), equation
(2.6) becomes

(2.8)  $R_{n+1}(H, \alpha) = [1 + \alpha(R)(\alpha(R) + K_n) / (K_n(\alpha(R) + m_{n+1}))] R_{n+1}(\alpha)$.

Hence, $R_{n+1}(H, \alpha)/R_{n+1}(\alpha) \to 1$ as $m_n \to \infty$.  ///


We can compare the performance of the estimator $H_{n+1}$ to that of the sample
distribution function $\hat F_{n+1}$ at each stage. The following corollary to Theorem 2.3
shows that, under certain mild conditions on the sample sizes $m_{n+1}$, $H_{n+1}$ is
better than the sample distribution function in the sense that $H_{n+1}$ has smaller
overall expected loss.

Corollary 2.1.  For each $n = 1, 2, \ldots$, $R(\hat F_{n+1}, \alpha) > R_{n+1}(H, \alpha)$ if and only
if $K_n > m_{n+1}$.

Proof.  From equation (3.3) of Korwar et al (1976),

(2.9)  $R(\hat F_{n+1}, \alpha) = [1 + \alpha(R)/m_{n+1}] R_{n+1}(\alpha)$.

Hence, comparing (2.8) and (2.9), the result follows.  ///
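The comparison of (2.8) and (2.9) can be checked with exact arithmetic. The sketch below is an illustration only, and its function names are hypothetical. It assumes the multiplier in (2.8) reads $1 + \alpha(R)(\alpha(R) + K_n)/(K_n(\alpha(R) + m_{n+1}))$ and the multiplier in (2.9) reads $1 + \alpha(R)/m_{n+1}$. It uses `fractions.Fraction` so that the boundary case $K_n = m_{n+1}$ is not blurred by floating-point rounding.

```python
from fractions import Fraction

def h_factor(alpha_R, K_n, m_next):
    """Multiplier of R_{n+1}(alpha) in (2.8), as read here (an assumption)."""
    a = Fraction(alpha_R)
    return 1 + a * (a + K_n) / (K_n * (a + m_next))

def ecdf_factor(alpha_R, m_next):
    """Multiplier of R_{n+1}(alpha) in (2.9)."""
    return 1 + Fraction(alpha_R) / m_next

def h_beats_ecdf(alpha_R, K_n, m_next):
    """True when H has strictly smaller risk than the sample distribution function."""
    return ecdf_factor(alpha_R, m_next) > h_factor(alpha_R, K_n, m_next)

# Small grid of integer values; the inequality should hold exactly when K_n > m_next.
grid_ok = all(
    h_beats_ecdf(a, K, m) == (K > m)
    for a in (1, 2, 5)
    for K in range(1, 12)
    for m in range(1, 12)
)
```

Cross-multiplying the two multipliers reduces the inequality to $K_n \alpha(R) > m_{n+1} \alpha(R)$, which is why the condition does not depend on the value of $\alpha(R)$.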


We have considered the asymptotic optimality of the proposed sequences of
estimators of a distribution function in an empirical Bayes setting. In
general, however, the comparison of the three sequences for small values of $n$
by analytical methods is difficult, if not impossible. Monte Carlo simulations
have been performed, assuming that $F$ is a Weibull distribution with a known
shape parameter and random scale parameter $\beta$. Some of the results of the
simulations are given in the next section.

3.  Monte Carlo Comparisons

In  this  section,  we  implement  Monte  Carlo  simulation  of  random  lifetimes 
to  study  properties  of  and  compare  the  empirical  Bayes  estimators  discussed 
in  Section  2. 

The Weibull distribution $F(t) = 1 - \exp[-t^{\gamma}/\beta]$, $(t \ge 0)$, was taken to be the
failure model and was assumed to be the "correct" model reflecting past knowledge.
With the parameter $\gamma$ fixed, we assume $\beta$ is randomly distributed with the
exponential distribution as the prior distribution (Canavos and Tsokos (1973)).

For each fixed $\gamma$, $\alpha(R)$, and $\lambda$ (the parameter of the exponential prior
distribution for $\beta$), the simulations were performed as follows:


1.  Fifteen values of $\beta$ were generated from the assumed exponential prior
distribution with parameter $\lambda$. The true reliability $R(t)$ for the Weibull
distribution was computed and stored for each of the 15 stages, where $t$ is chosen
such that $R(t) = 0.4$.

2.  A sample of size $m_i$ was generated from a Weibull distribution for each
of the 15 values of $\beta$, representing 15 stages of the process. Three sequences
of estimators were then computed according to (1.1), (2.3) and (2.5), and the
squared errors between those values and the true reliabilities were stored.

3.  With the same 15 values of $\beta$, step 2 was repeated 100 times, and the
average squared error was calculated.

4.  Steps 1 through 3 were repeated 100 times (at each time, 15 new $\beta$ values
were generated in step 1). The mean of the average squared errors of each
estimator from the true reliability stored in step 3 for each of the 100
repetitions was computed, giving an estimated mean squared error (MSE).
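The four steps above can be sketched in code as follows. This is a hedged reconstruction rather than the original program. It assumes that $\lambda$ is the mean of the exponential prior (the report does not say whether it is a mean or a rate), that reliability is estimated as one minus each distribution function estimate at a fixed time `t` supplied by the caller, and that Weibull lifetimes are drawn by inverse transform. The function name `simulate_mse` and all argument names are hypothetical.

```python
import numpy as np

def simulate_mse(gamma, alpha_R, lam, sizes, t, n_outer=100, n_inner=100, seed=0):
    """Estimated MSE of the reliability estimates from G, G*, H at stages 2..N.

    Returns an array of shape (3, N - 1); rows correspond to G, G*, H.
    """
    rng = np.random.default_rng(seed)
    sizes = np.asarray(sizes)
    n_stages = len(sizes)
    total = np.zeros((3, n_stages - 1))
    for _ in range(n_outer):
        # Step 1: one scale parameter per stage (ASSUMPTION: lam is the prior mean).
        beta = rng.exponential(scale=lam, size=n_stages)
        true_rel = np.exp(-(t ** gamma) / beta)
        for _ in range(n_inner):
            # Step 2: Weibull lifetimes via T = (beta * E)^(1/gamma), E ~ Exp(1),
            # so that P(T <= t) = 1 - exp(-t^gamma / beta).
            hits = np.array([
                np.sum((b * rng.exponential(size=m)) ** (1.0 / gamma) <= t)
                for b, m in zip(beta, sizes)
            ])
            F_hat = hits / sizes
            G_star = F_hat[0]
            for k in range(1, n_stages):
                p = alpha_R / (alpha_R + sizes[k])
                G = p * F_hat[:k].mean() + (1 - p) * F_hat[k]                   # (1.1)
                G_star = p * G_star + (1 - p) * F_hat[k]                        # (2.3)
                H = p * hits[:k].sum() / sizes[:k].sum() + (1 - p) * F_hat[k]   # (2.5)
                est_rel = 1.0 - np.array([G, G_star, H])
                total[:, k - 1] += (est_rel - true_rel[k]) ** 2
    # Steps 3 and 4: average over inner repetitions, then over outer repetitions.
    return total / (n_outer * n_inner)

# Small illustrative run with equal stage sizes (values are assumptions).
mse = simulate_mse(gamma=2.0, alpha_R=5.0, lam=1.0, sizes=[5] * 15, t=1.0,
                   n_outer=3, n_inner=4, seed=1)
```

Because only the count of lifetimes not exceeding `t` enters each estimator, the same results could be obtained from binomial draws; explicit Weibull sampling is kept here to mirror the description in step 2.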

The above procedures were repeated for several different values of $\gamma$, $\alpha(R)$,
and $\lambda$. Some of the results of the simulations are given in Tables 1 and 2.
The tables give the average true values of reliability and the MSE's of the
three sequences of estimators at each of the 15 stages.

The results indicate that the estimated mean squared errors of $G$ are
generally smaller than those of $G^*$ at each stage when the sample sizes are equal.
Also, for each of the estimators, the mean squared errors for sample size 10
are smaller than those for sizes 3 and 5. This, however, follows from the
observation that [...] $\to 0$ as $m \to \infty$. Also, $G$ and $H$ perform equally well in the
sense that neither of the MSE's of $G$ or $H$ is uniformly smaller than the other
throughout the 15 stages when sample sizes are unequal.

Hence, nothing can be said definitely about which estimator is generally
better than either of the other two for small $n$. Obviously, the Korwar-Hollander
estimators $G$ perform better in the sense of smaller asymptotic risk
than $G^*$, although for unequal sample sizes $G$ and $H$ are very close. In addition,
it was observed that the choice of the value of $\alpha(R)$ had little effect on the
results after the first few stages of the process.